Abstract. It is known that any quantum algorithm for Graph Isomorphism
that works within the framework of the hidden subgroup problem
(HSP) must perform highly entangled measurements across $\Omega(n \log n)$
coset states. One of the only known models for how such a measurement
could be carried out efficiently is Kuperberg’s algorithm for the HSP in
the dihedral group, in which quantum states are adaptively combined
and measured according to the decomposition of tensor products into
irreducible representations. This “quantum sieve” starts with coset states,
and works its way down towards representations whose probabilities differ
depending on, for example, whether the hidden subgroup is trivial or
nontrivial.

In this paper we show that no such approach can produce a polynomial-time
quantum algorithm for Graph Isomorphism. Specifically, we
consider the natural reduction of Graph Isomorphism to the HSP over
the wreath product $S_n \wr \mathbb{Z}_2$. Using a recently proved bound on the
irreducible characters of $S_n$, we show that no algorithm in this family
can solve Graph Isomorphism in less than $e^{\Omega(\sqrt{n})}$ time, no matter what
adaptive rule it uses to select and combine quantum states. In particular,
algorithms of this type can offer essentially no improvement over the
best known classical algorithms, which run in time $e^{O(\sqrt{n \log n})}$.

1.  Introduction 

The discovery of Shor’s and Simon’s algorithms began a frenzied charge
to uncover the full algorithmic potential of a general purpose quantum computer.
Creative invocations of the order-finding primitive yielded efficient
quantum algorithms for a number of other number-theoretic problems [Hal02,
Hal05]. As the field matured, these algorithms were roughly unified under
the general framework of the hidden subgroup problem, where one must
determine a subgroup $H$ of a group $G$ by querying an oracle $f : G \to S$
known to have the property that $f(g) = f(gh) \Leftrightarrow h \in H$. Solutions to this
general problem are the foundation for almost all known superpolynomial
speedups offered by quantum algorithms over their classical counterparts
(see [AJL06] for an important exception).

The algorithms of Simon and Shor essentially solve the hidden subgroup
problem on abelian groups, namely $\mathbb{Z}_2^n$ and $\mathbb{Z}_n^*$ respectively. Since then,
non-abelian hidden subgroup problems have received a great deal of attention
(e.g. [HRTS00, GSVV01, FIM+03, MRS04, BCvD05, HMR+06]). A
major motivation for this work is the fact that we can reduce Graph Isomorphism
for rigid graphs of size $n$ to the case of the hidden subgroup problem
over the symmetric group $S_{2n}$, or more specifically the wreath product
$S_n \wr \mathbb{Z}_2$, where the hidden subgroup is promised to be either trivial or of
order two. The standard approach to these problems is to prepare “coset
states” of the form

$$\rho_H = \frac{1}{|G|} \sum_{c \in G} |cH\rangle \langle cH| ,$$

where $|S\rangle$, for a subset $S \subseteq G$, denotes the uniform superposition $(1/\sqrt{|S|}) \sum_{s \in S} |s\rangle$.

In the abelian case, one proceeds by computing the quantum Fourier transform
of such coset states, measuring the resulting states, and appropriately
interpreting the results. In the case of the symmetric group, however, determining
$H$ from a quantum measurement of coset states is far more difficult.
In particular, no product measurement (that is, a measurement which treats
each coset state independently) can efficiently determine a hidden subgroup
over $S_n$ [MRS05]; in fact, any successful measurement must be entangled
over $\Omega(n \log n)$ coset states at once [HMR+06].

One of the few proposed methods for building such an entangled measurement
comes from Kuperberg’s algorithm for the hidden subgroup problem
in the dihedral group [Kup05]. It starts by generating a large number
of coset states and subjecting each one to weak Fourier sampling, so that
it lies inside a known irreducible representation. It then proceeds with an
adaptive “sieve” process, at each step of which it judiciously selects pairs
of states and measures them in a basis consistent with the Clebsch-Gordan
decomposition of their tensor product into irreducible representations. This
sieve continues until we obtain a state lying in an “informative” representation:
namely, one from which information about the hidden subgroup can
be easily extracted. We can visualize a run of the sieve as a forest, where
leaves consist of the initial coset states, each internal node measures the
tensor product of its parents, and the informative representations lie at the
roots.

This approach is especially attractive in cases like Graph Isomorphism,
where all we need to know is whether the hidden subgroup is trivial or
nontrivial. Specifically, suppose that the hidden subgroup $H$ is promised
to be either the trivial subgroup $\{1\}$ or a conjugate of a known subgroup
$H_0$. Assume further that there is an irreducible representation $\sigma$ of $G$ with
the property that $\sum_{h \in H_0} \sigma(h) = 0$; that is, a “missing harmonic” in the
sense of [MR05a]. In this case, if $H$ is nontrivial then the probability of
observing $\sigma$ under weak Fourier sampling of the coset state $\rho_H$ is zero.
More generally, as we discuss below, the irrep $\sigma$ cannot appear at any time
in the sieve. If, on the other hand, one can guarantee that the sieve does
observe $\sigma$ with significant probability when the hidden subgroup is trivial
and the corresponding states are completely mixed, it gives us an algorithm
to distinguish the two cases.

For example, if we consider the case of the hidden subgroup problem
in the dihedral group $D_n$ where $H$ is either trivial or a conjugate of
$H_0 = \{1, m\}$ where $m$ is an involution, then the sign representation $\pi$ is a missing
harmonic. Applying Kuperberg’s sieve, we observe $\pi$ with significant
probability after $e^{O(\sqrt{\log n})}$ steps if $H$ is trivial, while we can never observe it
if $H$ is of order 2. A similar approach was applied to groups of the form $G^n$
by Alagic et al. [AMR06].

We show here, however, that the hidden subgroup problem related to
Graph Isomorphism cannot be solved efficiently by any algorithm in this
family. Specifically, no matter what adaptive selection rule it uses to choose
pairs of states to combine and measure, such a sieve cannot distinguish the
isomorphic and nonisomorphic cases unless it takes time $e^{\Omega(\sqrt{n})}$ (and uses
this many coset states). In comparison, the best known classical algorithms
for Graph Isomorphism run in time $e^{O(\sqrt{n \log n})}$ for general graphs [Bab80,
Bab83] and $e^{O(n^{1/3} \log^2 n)}$ for strongly regular graphs [Spi96]. Therefore,
quantum algorithms of this kind can offer no meaningful improvement over
their classical counterparts.

Our proof relies on several ingredients. First, we give a formal definition
of quantum sieve algorithms, and we derive a combinatorial description of
the probability distributions of their observations in the trivial and nontrivial
cases. We then focus on the case where the ambient group is a wreath product
$G \wr \mathbb{Z}_2$, and show that no information is gained until the sieve observes a
so-called inhomogeneous representation. Then, in the case where $G = S_n$,
we rely on a bound on the characters of the symmetric group proved very
recently by Rattan and Śniady [RS06] to show that the total variation distance
between the trivial and nontrivial cases is at most $e^{-b\sqrt{n}}$ unless the
sieve takes $e^{a\sqrt{n}}$ time, for some constants $a, b > 0$.

We note that two of the present authors gave this result in conditional
form in [MR06], in which they presented a conjectured bound on the characters
of $S_n$. Indeed, it was this conjecture which inspired the work of [RS06],
who proved a weaker version of it, which, along with some additional arguments,
allows us to prove the results of [MR06] unconditionally.

We refer the reader to [Ser77, JK81] for an introduction to the representation
theory of finite groups, and in particular of the symmetric group
$S_n$. One fact which we use repeatedly is that the $\tau$-isotypic subspace, i.e.,
the subspace of a representation $\sigma$ which consists of copies of an irrep $\tau$, is
the image of the projection operator

$$\Pi_\tau = \frac{1}{|G|} \sum_{g \in G} d_\tau \chi_\tau(g)^* \sigma(g) .$$

These projection operators can be combined to create a measurement whose
outcomes are names of irreducible representations. Applying such a measurement
to coset states is known as weak Fourier sampling; we use the term
isotypic sampling to refer to the more general case of applying an arbitrary
group action to a multiregister state.


2.  Fourier  analysis  on  finite  groups 


In this section we review the representation theory of finite groups. Our
treatment is primarily for the purposes of setting down notation; we refer
the reader to [Ser77] for a complete account. Let $G$ be a finite group. A
representation $\sigma$ of $G$ is a homomorphism $\sigma : G \to U(V)$, where $V$ is a
finite-dimensional Hilbert space and $U(V)$ is the group of unitary operators
on $V$. The dimension of $\sigma$, denoted $d_\sigma$, is the dimension of the vector space
$V$. Fixing a representation $\sigma : G \to U(V)$, we say that a subspace $W \subseteq V$
is invariant if $\sigma(g) \cdot W = W$ for all $g \in G$. When $\sigma$ has no invariant
subspaces other than the trivial subspace $\{0\}$ and $V$ itself, $\sigma$ is said to be
irreducible.

If two representations $\sigma$ and $\sigma'$ are the same up to a unitary change of
basis, we say that they are equivalent. It is a fact that any finite group $G$ has
a finite number of distinct irreducible representations up to equivalence and,
for a group $G$, we let $\widehat{G}$ denote a set of representations containing exactly
one from each equivalence class. We often say that each $\sigma \in \widehat{G}$ is the name
of an irreducible representation, or an irrep for short.

The irreps of $G$ give rise to the Fourier transform. Specifically, for a
function $f : G \to \mathbb{C}$ and an element $\sigma \in \widehat{G}$, define the Fourier transform of
$f$ at $\sigma$ to be

$$\hat{f}(\sigma) = \sqrt{\frac{d_\sigma}{|G|}} \sum_{g \in G} f(g) \sigma(g) .$$

The leading coefficients are chosen to make the transform unitary, so
that it preserves inner products:

$$\langle f_1, f_2 \rangle = \sum_{g \in G} f_1^*(g) f_2(g) = \sum_{\sigma \in \widehat{G}} \mathrm{tr}\left( \hat{f}_1(\sigma)^\dagger \cdot \hat{f}_2(\sigma) \right) .$$

If $\sigma$ is not irreducible, it can be decomposed into a direct sum of irreps $\tau_i$,
each of which acts on an invariant subspace, and we write $\sigma = \tau_1 \oplus \cdots \oplus \tau_\ell$.
In general, a given $\tau$ can appear multiple times in this decomposition, in
the sense that $\sigma$ may have an invariant subspace isomorphic to the direct
sum of $a_\tau$ copies of $\tau$. In this case $a_\tau$ is called the multiplicity of $\tau$ in the
decomposition of $\sigma$.

There is a natural product operation on representations: if $\lambda : G \to U(V)$
and $\mu : G \to U(W)$ are representations of $G$, we may define a
new representation $\lambda \otimes \mu : G \to U(V \otimes W)$ as
$(\lambda \otimes \mu)(g) : u \otimes v \mapsto \lambda(g)u \otimes \mu(g)v$.
This representation corresponds to the diagonal action of
$G$ on $V \otimes W$, in which we apply the same group element to both parts of
the tensor product. In general, the representation $\lambda \otimes \mu$ is not irreducible,
even when both $\lambda$ and $\mu$ are. This leads to the Clebsch-Gordan problem,
that of decomposing $\lambda \otimes \mu$ into irreps.

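Given a representation $\sigma$ we define the character of $\sigma$, denoted $\chi_\sigma$, to
be the trace $\chi_\sigma(g) = \mathrm{tr}\, \sigma(g)$. As the trace of a linear operator is invariant
under conjugation, characters are constant on the conjugacy classes of $G$.
Characters are a powerful tool for reasoning about the decomposition of
reducible representations. In particular, when $\sigma = \bigoplus_i \tau_i$ we have
$\chi_\sigma = \sum_i \chi_{\tau_i}$ and, moreover, for $\sigma, \tau \in \widehat{G}$, we have the orthogonality conditions

$$\langle \chi_\sigma, \chi_\tau \rangle_G = \frac{1}{|G|} \sum_{g \in G} \chi_\sigma(g) \chi_\tau(g)^* = \begin{cases} 1 & \sigma = \tau , \\ 0 & \sigma \neq \tau . \end{cases}$$

Therefore, given a representation $\sigma$ and an irrep $\tau$, the multiplicity $a_\tau$ with
which $\tau$ appears in the decomposition of $\sigma$ is $\langle \chi_\tau, \chi_\sigma \rangle_G$. For example, since
$\chi_{\lambda \otimes \mu}(g) = \chi_\lambda(g) \cdot \chi_\mu(g)$, the multiplicity of $\tau$ in the Clebsch-Gordan
decomposition of $\lambda \otimes \mu$ is $\langle \chi_\tau, \chi_\lambda \chi_\mu \rangle_G$.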
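As a concrete illustration of the character inner product, the following sketch computes Clebsch-Gordan multiplicities for the small group $S_3$. The choice of $S_3$, the irrep names `trivial`, `sign` and `standard`, and all function names are assumptions made for this example and do not come from the paper; the character formulas used (sign of a permutation, number of fixed points minus one) are the standard ones for $S_3$.

```python
from fractions import Fraction
from itertools import permutations

# Elements of S_3 as tuples p, where p[i] is the image of i.
G = list(permutations(range(3)))

def sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def fixed_points(p):
    return sum(1 for i, x in enumerate(p) if i == x)

# Characters of the three irreps of S_3 (all real-valued).
CHARS = {
    "trivial": lambda p: 1,
    "sign": sign,
    "standard": lambda p: fixed_points(p) - 1,
}

def inner(chi, psi):
    """<chi, psi>_G = (1/|G|) sum_g chi(g) psi(g)^*; conjugation is a no-op here."""
    return Fraction(sum(chi(g) * psi(g) for g in G), len(G))

def multiplicity(tau, lam, mu):
    """Multiplicity of tau in lam (x) mu, i.e. <chi_tau, chi_lam chi_mu>_G."""
    return inner(CHARS[tau], lambda g: CHARS[lam](g) * CHARS[mu](g))

def clebsch_gordan(lam, mu):
    """Multiplicity of every irrep in the decomposition of lam (x) mu."""
    return {tau: multiplicity(tau, lam, mu) for tau in CHARS}
```

For instance, `clebsch_gordan("standard", "standard")` reports that each of the three irreps appears once, which matches the dimension count $2 \cdot 2 = 1 + 1 + 2$.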

A representation $\sigma$ is said to be isotypic if the irreducible factors appearing
in the decomposition are all isomorphic, which is to say that there is a
single nonzero $a_\tau$ in the decomposition above. Any representation $\sigma$ may
be uniquely decomposed into maximal isotypic subspaces, one for each irrep
$\tau$ of $G$; these subspaces are precisely those spanned by all copies of $\tau$
in $\sigma$. In fact, for each $\tau$ this subspace is the image of an explicit projection
operator $\Pi_\tau$ which can be written as

$$\Pi_\tau = \frac{1}{|G|} \sum_{g \in G} d_\tau \chi_\tau(g)^* \sigma(g) .$$

A useful fact is that $\Pi_\tau$ commutes with the group action; that is, for any
$h \in G$ we have

$$\sigma(h) \Pi_\tau \sigma(h)^\dagger = \frac{1}{|G|} \sum_{g \in G} d_\tau \chi_\tau(g)^* \sigma(hgh^{-1}) = \frac{1}{|G|} \sum_{g \in G} d_\tau \chi_\tau(h^{-1}gh)^* \sigma(g) = \frac{1}{|G|} \sum_{g \in G} d_\tau \chi_\tau(g)^* \sigma(g) = \Pi_\tau .$$
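The properties of $\Pi_\tau$ can be checked numerically on a small case. The sketch below builds $\Pi_\tau$ for the right-multiplication action of $S_3$ on its own group algebra, using exact rational arithmetic, so that one can confirm it is a projection, that its rank is $d_\tau^2$, and that it commutes with the group action. The group $S_3$, the irrep names and every helper name here are assumptions introduced for illustration only.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))            # S_3; p[i] is the image of i
INDEX = {g: k for k, g in enumerate(G)}     # basis vector index of each element
N = len(G)

def compose(p, q):
    """Group product as function composition: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    inv = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inv % 2 else 1

def chi_standard(p):
    return sum(1 for i, x in enumerate(p) if i == x) - 1

# name -> (dimension, character); all characters of S_3 are real.
IRREPS = {"trivial": (1, lambda p: 1), "sign": (1, sign), "standard": (2, chi_standard)}

def reg(g):
    """Permutation matrix of right multiplication h -> h g on the group algebra."""
    M = [[Fraction(0)] * N for _ in range(N)]
    for h in G:
        M[INDEX[compose(h, g)]][INDEX[h]] = Fraction(1)
    return M

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def trace(A):
    return sum(A[i][i] for i in range(N))

def projector(name):
    """Pi_tau = (1/|G|) sum_g d_tau chi_tau(g)^* reg(g)."""
    d, chi = IRREPS[name]
    P = [[Fraction(0)] * N for _ in range(N)]
    for g in G:
        R = reg(g)
        c = Fraction(d * chi(g), N)
        for i in range(N):
            for j in range(N):
                P[i][j] += c * R[i][j]
    return P
```

Because the entries are `Fraction` values, equalities such as `matmul(P, P) == P` can be tested exactly rather than up to floating-point tolerance.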

Our algorithms will perform measurements which project into these maximal
isotypic subspaces and observe the resulting irrep name $\tau$. For the particular
case of coset states, this measurement is called weak Fourier sampling
in the literature; however, since we are interested in a more general
process which in fact performs a kind of strong multiregister sampling on
the original coset states, we will use the term isotypic sampling instead.

Finally, we discuss the structure of a specific representation, the (right) regular
representation reg, which plays an important role in the analysis below.
It is given by the permutation action of $G$ on itself. Specifically, let $\mathbb{C}[G]$
be the group algebra of $G$; this is the $|G|$-dimensional vector space of formal
sums

$$\Big\{ \sum_{g} \alpha_g \cdot g \;\Big|\; \alpha_g \in \mathbb{C} \Big\} .$$

(Note that $\mathbb{C}[G]$ is precisely the Hilbert space of a single register containing
a superposition of group elements.) Then reg is the representation
$\mathrm{reg} : G \to U(\mathbb{C}[G])$ given by linearly extending right multiplication,
$\mathrm{reg}(g) : h \mapsto hg$. It is not hard to see that its character $\chi_{\mathrm{reg}}$ is given by

$$\chi_{\mathrm{reg}}(g) = \begin{cases} |G| & g = 1 , \\ 0 & g \neq 1 , \end{cases}$$

in which case we have $\langle \chi_{\mathrm{reg}}, \chi_\sigma \rangle_G = d_\sigma$ for each $\sigma \in \widehat{G}$. Thus reg contains
$d_\sigma$ copies of each irrep $\sigma \in \widehat{G}$, and counting dimensions on each side of this
decomposition implies

(1) $$|G| = \sum_{\sigma \in \widehat{G}} d_\sigma^2 .$$

This equation suggests a natural probability distribution on $\widehat{G}$, the Plancherel
distribution, which assigns to each irrep $\sigma$ the probability
$P_{\mathrm{planch}}(\sigma) = d_\sigma^2 / |G|$. This is simply the dimensionwise fraction of $\mathbb{C}[G]$ consisting of
copies of $\sigma$; indeed, if we perform isotypic sampling on the completely
mixed state on $\mathbb{C}[G]$, or equivalently the coset state where the hidden subgroup
is trivial, we observe exactly this distribution.
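For the symmetric group the Plancherel distribution can be tabulated directly. The sketch below relies on two standard facts that are not stated in this section: the irreps of $S_n$ are labeled by partitions of $n$, and their dimensions are given by the hook length formula. The function names are hypothetical, chosen for this example.

```python
from fractions import Fraction
from math import factorial

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def dimension(shape):
    """Dimension of the irrep of S_n labeled by `shape` (hook length formula)."""
    n = sum(shape)
    hooks = 1
    for r, row_len in enumerate(shape):
        for c in range(row_len):
            arm = row_len - c - 1                                   # cells to the right
            leg = sum(1 for rr in range(r + 1, len(shape)) if shape[rr] > c)  # cells below
            hooks *= arm + leg + 1
    return factorial(n) // hooks

def plancherel(n):
    """Plancherel distribution on the irreps of S_n: d^2 / n! for each partition."""
    return {shape: Fraction(dimension(shape) ** 2, factorial(n)) for shape in partitions(n)}
```

Summing the values of `plancherel(n)` gives 1, which is equation (1) divided through by $|G| = n!$.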
In general, we can consider subspaces of $\mathbb{C}[G]$ that are invariant under left
multiplication, right multiplication, or both; these subspaces are called left-,
right-, or bi-invariant respectively. For each $\sigma \in \widehat{G}$, the maximal $\sigma$-isotypic
subspace is a $d_\sigma^2$-dimensional bi-invariant subspace; it can be broken up
further into $d_\sigma$ $d_\sigma$-dimensional left-invariant subspaces, or (transversely) $d_\sigma$
$d_\sigma$-dimensional right-invariant subspaces. However, this decomposition is
not unique. If $\sigma$ acts on a vector space $V$, then choosing an orthonormal
basis for $V$ allows us to view $\sigma(g)$ as a $d_\sigma \times d_\sigma$ matrix. Then $\sigma$ acts on
the $d_\sigma^2$-dimensional space of such matrices by left or right multiplication,
and the columns and rows correspond to left- and right-invariant spaces
respectively.


3.  Clebsch-Gordan  sieves 

Consider the hidden subgroup problem over a group $G$ with the added
promise that the hidden subgroup $H$ is either the trivial subgroup, or a
conjugate of some fixed nontrivial subgroup $H_0$. We shall consider sieve
algorithms for this problem that proceed as follows:

1. The oracle is used to generate $\ell = \ell(n)$ coset states $\rho_H$, each of which
is subjected to weak Fourier sampling. This results in a set of states $\rho_i$,
where $\rho_i$ is a mixed state known to lie in the $\sigma_i$-isotypic subspace of $\mathbb{C}[G]$
for some irrep $\sigma_i$.

2. The following combine-and-measure procedure is then repeated as
many times as we like. Two states $\rho_i$ and $\rho_j$ in the set are selected according
to an arbitrary adaptive rule that may depend on the entire history of
the computation (in existing algorithms of this type, this selection in fact
depends only on the irreps $\sigma_i$ and $\sigma_j$ in which they lie). We then perform
isotypic sampling on their tensor product $\rho_i \otimes \rho_j$; that is, we apply a measurement
operator which observes an irrep $\sigma$ in the Clebsch-Gordan decomposition
of $\sigma_i \otimes \sigma_j$ (see [Kup05] or [MR05a] for how this measurement can
actually be carried out by applying the diagonal action). This measurement
destroys $\rho_i$ and $\rho_j$, and results in a new mixed state $\rho$ which lies in the
maximal $\sigma$-isotypic subspace; we add this new state to the set.

3. Finally, depending on the sequence of observations obtained throughout
this process, the algorithm guesses the hidden subgroup.

We set down some notation to discuss the result of applying such an algorithm.
Fixing a group $G$ and a subgroup $H$, let $A$ be a sieve algorithm
which initially generates $\ell$ coset states. As a bookkeeping tool, we will describe
intermediate states of $A$’s progress as a forest of labeled binary trees.
Throughout, we will maintain the invariant that the roots of the trees in this
forest correspond to the current set of states available to the algorithm.
Initially, the state of the algorithm consists of a forest consisting of $\ell$
single-node trees, each of which is labeled with the irrep name $\sigma_i$ that resulted
from weak Fourier sampling a coset state, and is associated with the
resulting state $\rho_i$. Then, each combine-and-measure step selects two root
nodes, $r_1$ and $r_2$, and applies isotypic sampling to the tensor product of
their states. We associate the resulting state $\rho$ with a new root node $r$, and
place the nodes $r_1$ and $r_2$ below it as its children. We label this new node
with the irrep name $\sigma$ observed in this measurement.

Thus, every node of the forest corresponds to a state that existed at some
point during the algorithm, and each node $i$ is labeled with the name of the
irrep $\sigma_i$ observed in the isotypic measurement performed when that node
was created. We call the resulting labeled forest the transcript of the algorithm:
note that this transcript contains all the information the algorithm
may use to determine the hidden subgroup.

We  make  several  observations  about  algorithms  of  this  type.  First,  it  is 
easy  to  see  that  nothing  is  gained  by  combining  t  >  2  states  at  a  time;  we 
can  simulate  this  with  an  algorithm  which  builds  a  binary  tree  with  t  leaves, 
and  which  ignores  the  results  of  all  its  measurements  except  the  one  at  the 
root. 

Second, the algorithm maintains the following kind of symmetry under
the action of the subgroup $H$. Suppose we have a representation $\sigma$ acting
on a Hilbert space $V$. Given a subgroup $H$, we say that a state $\psi \in V$ is
$H$-invariant if $\sigma(h) \cdot \psi = \psi$ for all $h \in H$. Similarly, given a mixed state $\rho$,
we say that $\rho$ is $H$-invariant if $\sigma(h) \cdot \rho \cdot \sigma(h)^\dagger = \rho$ or, equivalently, if $\sigma(h)$
and $\rho$ commute. For instance, the coset state $\rho_H$ is $H$-invariant under the
right regular representation, since right-multiplying by any $h \in H$ preserves
each left coset $cH$. Now, suppose that $\rho_1$ and $\rho_2$ are $H$-invariant; clearly
$\rho_1 \otimes \rho_2$ is $H$-invariant under the diagonal action, and performing isotypic
sampling preserves $H$-invariance since $\Pi_\tau$ commutes with the action of any
group element. Thus the states produced by the algorithm are $H$-invariant
throughout.

Third, it is important to note that while at each stage we observe only an
irrep name, rather than a basis vector inside that representation, by iterating
this process the sieve algorithm actually performs a kind of strong multiregister
Fourier sampling on the original set of coset states. For instance,
in the dihedral group, suppose that performing weak Fourier sampling on
two coset states results in the two-dimensional irreps $\sigma_j$ and $\sigma_k$, and that we
then observe the irrep $\sigma_{j+k}$ under isotypic sampling of their tensor product.
We now know that the original coset states were in fact confined to a particular
subspace, spanned by two entangled pairs of basis vectors.

Finally, we note that the states produced by a sieve algorithm are quite different from
coset states. In particular, they belong not to a maximal isotypic subspace
of $\mathbb{C}[G]$, but to a (typically much higher-dimensional) non-maximal isotypic
subspace of $\mathbb{C}[G]^{\otimes \ell}$, where $\ell$ is the number of coset states feeding into that
state (i.e., the number of leaves of the corresponding tree). Moreover, they
have more symmetry than coset states, since each isotypic measurement
implies a symmetry with respect to the diagonal action on the set of leaves
descended from the corresponding internal node. In the next sections we
will show how these states can be written in terms of projection operators
applied to this high-dimensional space.

4.  Observed  distributions  for  fixed  topologies 

In general, the probability distributions arising from the combine-and-measure
steps of a sieve algorithm depend on both the hidden subgroup and
the entire history of previous measurements and observations (that is, the
labeled forest, or transcript, describing the algorithm’s history thus far). In
this section and the next, we focus on the probability distribution induced
by a fixed forest topology and subgroup $H$. We can think of this either as the
probability distribution conditioned on the forest topology, or as the distribution
of transcripts produced by some non-adaptive sieve algorithm, which
chooses which states it will combine and measure ahead of time. We will
show that for all forest topologies of sufficiently small size, the induced distributions
on irrep labels fail to distinguish trivial and nontrivial subgroups.
Then, in Section 7, we will complete the argument for adaptive algorithms.

Clearly, in this non-adaptive case the distributions of irrep labels associated
with different trees in the forest are independent. Therefore, we can focus
on the distribution of labels for a specific tree. At the leaves, the labels are
independent and identically distributed according to the distribution resulting
from weak Fourier sampling a coset state [HRTS00]. However, as we
move inside the tree and condition on the irrep labels observed previously,
the resulting distributions are quite different from this initial one. To calculate
the resulting joint probability distribution, we need to define projection
operators acting on $\mathbb{C}[G]^{\otimes \ell}$ corresponding to the isotypic measurement at
each node.

First, note that the coset state $\rho_H$ can be written in the following convenient
form:

$$\rho_H = \frac{1}{|G|} \sum_{h \in H} \mathrm{reg}(h) ,$$

where reg is the right regular representation: that is, $\rho_H$ is proportional to
the projection operator which right-multiplies by a random element of $H$.
If $H$ is trivial, $\rho_H$ is the completely mixed state $\rho_{\{1\}} = (1/|G|) \mathbb{1}$. On the
other hand, if $H = \{1, m\}$ for an involution $m$, then $\rho_H = (2/|G|) \Pi_H$,
where $\Pi_H$ is the projection operator

$$\Pi_H = \frac{1}{2} \left( \mathbb{1} + \mathrm{reg}(m) \right) .$$

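Now consider the tensor product of $\ell$ “registers,” each containing a coset
state. Given a linear operator $M$ on $\mathbb{C}[G]$ and a subset $I \subseteq [\ell] = \{1, \ldots, \ell\}$,
let $M^I$ denote the operator on $\mathbb{C}[G^\ell] = \mathbb{C}[G]^{\otimes \ell}$ which applies $M$ to the
registers in $I$ and leaves the other registers unchanged. Then the mixed
state consisting of $\ell$ independent coset states is $\rho_H^{\otimes \ell} = (2/|G|)^\ell \, \Pi_H^{\otimes \ell}$, where

(2) $$\Pi_H^{\otimes \ell} = \frac{1}{2^\ell} \prod_{j=1}^{\ell} \left( \mathbb{1} + \mathrm{reg}(m)^{\{j\}} \right) = \frac{1}{2^\ell} \sum_{I \subseteq [\ell]} \mathrm{reg}(m)^I .$$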

Note the sum over subsets of registers, a theme which has appeared repeatedly
in discussions of multiregister Fourier sampling [Reg02, BCvD06,
HMR+06, Kup05, MR05a, MR05b]. Now consider a tree $T$ with $\ell$ leaves
corresponding to the $\ell$ initial registers, and $k$ nodes including the leaves. We
represent this tree as a set system, in which each node $i$ is associated with
the subset $I_i \subseteq [\ell]$ of leaves descended from it. In particular, $I_{\mathrm{root}} = [\ell]$ and
$I_j = \{j\}$ for each leaf $j$.
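The set-system view of a tree is easy to make concrete. The following sketch, in which the `Node` class and every function name are hypothetical and chosen only for illustration, records one tree of a transcript and computes the subset of leaves below each node. It also checks the property that any two such subsets are either disjoint or nested.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

@dataclass
class Node:
    """One node of a transcript tree: an observed irrep label plus its children."""
    label: str                          # name of the irrep observed at this node
    children: Tuple["Node", ...] = ()   # empty for a leaf
    leaf_id: Optional[int] = None       # register index, set only for leaves

    def leaves(self) -> FrozenSet[int]:
        """The subset of registers (leaves) descended from this node."""
        if not self.children:
            return frozenset([self.leaf_id])
        return frozenset().union(*(c.leaves() for c in self.children))

def combine(left: Node, right: Node, label: str) -> Node:
    """A combine-and-measure step: a new root labeled with the observed irrep."""
    return Node(label, (left, right))

def set_system(root: Node) -> List[FrozenSet[int]]:
    """Leaf subsets of all nodes in the tree, root first."""
    out = [root.leaves()]
    for child in root.children:
        out.extend(set_system(child))
    return out

def is_laminar(sets: List[FrozenSet[int]]) -> bool:
    """True if every pair of sets is either disjoint or nested."""
    return all(a.isdisjoint(b) or a <= b or b <= a for a in sets for b in sets)
```

A tree built only through `combine` always yields a laminar family, since the children of a node partition its leaf set.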

Performing isotypic sampling at a node $i$ corresponds to applying the
diagonal action to its children (or in terms of the algorithm, its parents)
and inductively to the registers in $I_i$; that is, we multiply each register in
$I_i$ by the same element $g$ and leave the others fixed. If $\sigma_i$ is the irrep label
observed at that node, let us denote its character and dimension by $\chi_i$ and
$d_i$ respectively, rather than the more cumbersome $\chi_{\sigma_i}$ and $d_{\sigma_i}$. Then the
projection operator corresponding to this observation is

(3) $$\Pi_i^T = \frac{1}{|G|} \sum_{g \in G} d_i \chi_i(g)^* \, \mathrm{reg}(g)^{I_i} .$$

Now consider a transcript of the sieve process which results in observing a
set of irrep labels $\boldsymbol{\sigma} = \{\sigma_i\}$ on the internal nodes of the tree. The projection
operator associated with this outcome is

(4) $$\Pi^T[\boldsymbol{\sigma}] = \prod_{i=1}^{k} \Pi_i^T .$$

We will abbreviate this as $\Pi^T$ whenever the context is clear. Note that the
various $\Pi_i^T$ in the product (4) pairwise commute, since for any two nodes
$i, j$ either $I_i$ and $I_j$ are disjoint, or one is contained in the other. In the
former case $a^{I_i}$ and $b^{I_j}$ commute for all $a, b$. In the latter case, say if $I_i \subseteq I_j$, we have
$a^{I_i} b^{I_j} = b^{I_j} (b^{-1}ab)^{I_i}$, and since $\chi_i(b^{-1}ab) = \chi_i(a)$ it follows from (3) that
$\Pi_i^T \Pi_j^T = \Pi_j^T \Pi_i^T$.

Given a tree $T$ with $k$ nodes, we write $P_T^{\{1\}}[\boldsymbol{\sigma}]$ for the probability that
we observe the set of irrep labels $\boldsymbol{\sigma} = \{\sigma_i\}$ in the case where the hidden
subgroup is trivial. Since the tensor product of coset states is then the completely
mixed state in $\mathbb{C}[G^\ell]$, this is simply the dimensionwise fraction of
$\mathbb{C}[G^\ell]$ consisting of the image of $\Pi^T$, or

$$P_T^{\{1\}}[\boldsymbol{\sigma}] = \frac{1}{|G|^\ell} \, \mathrm{tr}\, \Pi^T .$$

Moreover, since measuring a completely mixed state results in the completely
mixed state in the observed subspace, each state produced by the
algorithm is completely mixed in the image of $\Pi^T$. In particular, if the irrep
label at the root of a tree is $\sigma$, the corresponding state consists of a classical
mixture across some number of copies of $\sigma$, in each of which it is completely
mixed. Thus, when combining two parent states with irrep labels $\lambda$
and $\mu$, we observe each irrep $\tau$ with probability equal to the dimensionwise
fraction of $\lambda \otimes \mu$ consisting of copies of $\tau$, namely

(5) $$P_\tau^{\lambda \otimes \mu} = \frac{d_\tau}{d_\lambda d_\mu} \, \langle \chi_\tau, \chi_\lambda \chi_\mu \rangle_G$$

(recall that $\langle \chi_\tau, \chi_\rho \rangle_G = (1/|G|) \sum_{g \in G} \chi_\tau(g) \chi_\rho^*(g)$ is the multiplicity of $\tau$
in the decomposition of a representation $\rho$ into irreducibles). We will refer
to this as the natural distribution in $\lambda \otimes \mu$.

Now let us consider the case where the hidden subgroup is nontrivial.
Since the mixed state $\rho_H^{\otimes \ell}$ can be thought of as a pure state chosen randomly
from the image of $\Pi_H^{\otimes \ell}$, the probability of observing a set of irrep labels $\boldsymbol{\sigma}$
in this case is

$$P_T^{H}[\boldsymbol{\sigma}] = \frac{\mathrm{tr}\, \Pi^T \Pi_H^{\otimes \ell}}{\mathrm{tr}\, \Pi_H^{\otimes \ell}} = \frac{2^\ell}{|G|^\ell} \, \mathrm{tr}\, \Pi^T \Pi_H^{\otimes \ell} ,$$

where we use the fact that $\mathrm{tr}\, \Pi_H^{\otimes \ell} = [G : H]^\ell = (|G|/2)^\ell$. Below we abbreviate
these distributions as $P_T^{\{1\}}$ and $P_T^H$ whenever the context is clear. Our
goal is to show that, until the tree $T$ is deep enough, these two distributions
are extremely close, so that the algorithm fails to distinguish subgroups of
the form $\{1, m\}$ from the trivial subgroup.

Now let us derive explicit expressions for $P_T^{\{1\}}$ and $P_T^H$. First, we fix
some additional notation. Given an assignment of group elements $\{a_i\}$ to
the nodes, for each leaf $j$ we let $\prod_{i \leadsto j} a_i$ denote the product of the elements
along the path from the root to $j$:

$$\prod_{i \leadsto j} a_i = \prod_{i \,:\, j \in I_i} a_i ,$$

where the product is taken in order from the root to the leaf. Then using
(3) and (4) we can write

(6) $$\Pi^T = \frac{1}{|G|^k} \sum_{\{a_i\}} \prod_{i=1}^{k} d_i \chi_i(a_i)^* \bigotimes_{j=1}^{\ell} \mathrm{reg}\Big( \prod_{i \leadsto j} a_i \Big) .$$

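We say that an assignment $\{a_i\}$ is trivial if $\prod_{i \leadsto j} a_i = 1$ for every leaf $j$.
Then, since $\mathrm{tr}\, \mathrm{reg}(g) = \chi_{\mathrm{reg}}(g) = |G|$ if $g = 1$ and 0 otherwise, we have

(7) $$P_T^{\{1\}} = \frac{1}{|G|^\ell} \, \mathrm{tr}\, \Pi^T = \frac{1}{|G|^k} \sum_{\{a_i\} \text{ trivial}} \prod_{i=1}^{k} d_i \chi_i(a_i)^* .$$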

To  get  a  sense  of  how  this  expression  scales,  note  that  the  particular  trivial 
assignment  where  at  —  1  for  all  i  contributes  nf=i  d%/\G\  =  IL  ^planch (cr*), 
as  if  the  cr,  were  independent  and  Plancherel-distributed. 

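Now consider $P_T^H$. Combining (2) with (6) gives the following expression for $\Pi_T[\sigma]\, \Pi_H^{\otimes \ell}$:

$$\Pi_T[\sigma]\, \Pi_H^{\otimes \ell} = \frac{1}{2^\ell |G|^k} \sum_{\{a_i\}} \prod_{i=1}^{k} d_{\sigma_i} \chi_{\sigma_i}(a_i)^* \bigotimes_{j=1}^{\ell} \mathrm{reg}\Bigl( \prod_{i \leadsto j} a_i \Bigr) \bigl( 1 + \mathrm{reg}(m) \bigr). \tag{8}$$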


We say that an assignment $\{a_i\}$ is legal if $\prod_{i \leadsto j} a_i \in \{1, m\}$ for every leaf $j$. Then the trace of the term corresponding to $\{a_i\}$ is $|G|^\ell$ if $\{a_i\}$ is legal, and is 0 otherwise, and analogous to (7) we have

$$P_T^H[\sigma] = \frac{2^\ell}{|G|^\ell} \operatorname{tr} \Pi_T[\sigma]\, \Pi_H^{\otimes \ell} = \frac{1}{|G|^k} \sum_{\{a_i\}\ \mathrm{legal}} \prod_{i=1}^{k} d_{\sigma_i} \chi_{\sigma_i}(a_i)^*. \tag{9}$$

Thus these two distributions differ exactly by the terms corresponding to assignments which are legal but nontrivial. Our main result will depend on the fact that for most $\sigma$ these terms are identically zero, in which case $P_T^H$ and $P_T^{\{1\}}$ coincide.


5.  The  importance  of  being  homogeneous 

For any group $G$, the wreath product $G \wr \mathbb{Z}_2$ is the semidirect product $(G \times G) \rtimes \mathbb{Z}_2$, where we extend $G \times G$ by an involution which exchanges the two copies of $G$. Thus the elements $((\alpha, \beta), 0)$ form a normal subgroup $K \cong G \times G$ of index 2, and the elements $((\alpha, \beta), 1)$ form its nontrivial coset. We will call these elements “non-flips” and “flips,” respectively. The Graph Isomorphism problem reduces to the hidden subgroup problem on $S_n \wr \mathbb{Z}_2$ in the following natural way. We consider the disjoint union of the two graphs, and consider permutations of their $2n$ vertices. Then $S_n \wr \mathbb{Z}_2$ is the subgroup of $S_{2n}$ which either maps each graph onto itself (the non-flips) or exchanges the two graphs (the flips). We assume for simplicity that the graphs are rigid. Then if they are nonisomorphic, the hidden subgroup is trivial; if they are isomorphic, $H = \{1, m\}$ where $m$ is a flip of the form $((\alpha, \alpha^{-1}), 1)$, where $\alpha$ is the permutation describing the isomorphism between them.
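To make the reduction concrete, here is a minimal sketch that builds the flip $m$ from an isomorphism and checks that it is an involution preserving the disjoint union. The vertex numbering (first graph on $0, \ldots, n-1$, second graph on $n, \ldots, 2n-1$), the helper names, and the toy graphs are assumptions made for illustration; in particular, the path graphs used here are not rigid, so the example only demonstrates that $m$ is an automorphism, not that it is the only one.

```python
def flip_from_isomorphism(n, alpha):
    """Permutation of 2n vertices that swaps the two graphs using alpha.

    Vertices 0..n-1 belong to the first graph and n..2n-1 to the second.
    alpha[i] = j means vertex i of the first graph corresponds to vertex j
    of the second graph.
    """
    m = [0] * (2 * n)
    for i, j in enumerate(alpha):
        m[i] = n + j   # send first-graph vertex i to its image in the second graph
        m[n + j] = i   # and send that image back, so that m is an involution
    return m


def apply_to_edges(perm, edges):
    """Image of a set of undirected edges under a vertex permutation."""
    return {frozenset((perm[u], perm[v])) for u, v in edges}


n = 4
path1 = [(0, 1), (1, 2), (2, 3)]   # path 0-1-2-3
alpha = [2, 0, 3, 1]               # hypothetical relabelling of the vertices
path2 = [(alpha[u], alpha[v]) for u, v in path1]

# Disjoint union on 2n vertices: the second graph is shifted by n.
union = {frozenset(e) for e in path1} | {frozenset((n + u, n + v)) for u, v in path2}

m = flip_from_isomorphism(n, alpha)
is_involution = all(m[m[v]] == v for v in range(2 * n))
is_automorphism = apply_to_edges(m, union) == union
print(m, is_involution, is_automorphism)
```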

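For any group $G$, the irreps of $G \wr \mathbb{Z}_2$ can be written in a simple way in terms of the irreps of $G$. It is useful to construct them by inducing upward from the irreps of $K \cong G \times G$ (see [Ser77] for the definition of an induced representation). First, each irrep of $K$ is the tensor product $\lambda \otimes \mu$ of two irreps of $G$. Inducing this irrep from $K$ up to $G \wr \mathbb{Z}_2$ gives a representation

$$\sigma_{\{\lambda,\mu\}} = \mathrm{Ind}_K^{G \wr \mathbb{Z}_2} (\lambda \otimes \mu)$$

of dimension $2 d_\lambda d_\mu$. If $\lambda \not\cong \mu$, then this is irreducible, and $\sigma_{\{\lambda,\mu\}} \cong \sigma_{\{\mu,\lambda\}}$ (hence the notation). We call these irreps inhomogeneous. Their characters are given by

$$\chi_{\{\lambda,\mu\}}((\alpha,\beta),t) = \begin{cases} \chi_\lambda(\alpha)\chi_\mu(\beta) + \chi_\mu(\alpha)\chi_\lambda(\beta) & \text{if } t = 0 \\ 0 & \text{if } t = 1. \end{cases} \tag{10}$$

In particular, the character of an inhomogeneous irrep is zero at any flip.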

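On the other hand, if $\lambda \cong \mu$, then $\sigma_{\{\lambda,\lambda\}}$ decomposes into two irreps of dimension $d_\lambda^2$, which we denote $\sigma^+_{\{\lambda,\lambda\}}$ and $\sigma^-_{\{\lambda,\lambda\}}$. We call these irreps homogeneous. Their characters are given by

$$\chi^\pm_{\{\lambda,\lambda\}}((\alpha,\beta),t) = \begin{cases} \chi_\lambda(\alpha)\chi_\lambda(\beta) & \text{if } t = 0 \\ \pm\chi_\lambda(\alpha\beta) & \text{if } t = 1. \end{cases} \tag{11}$$

In the next section, we will show that a sieve algorithm obtains precisely zero information that distinguishes hidden subgroups of the form $\{1, m\}$ from the trivial subgroup until it observes at least one homogeneous representation.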
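The character formulas for inhomogeneous and homogeneous irreps can be sanity-checked on a small case. The sketch below, which takes $G = S_3$ as an illustrative assumption, builds the nine class functions on $S_3 \wr \mathbb{Z}_2$ given by those formulas and verifies that they are orthonormal, that their squared dimensions sum to the group order 72, and that the inhomogeneous ones vanish on every flip. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

S3 = list(permutations(range(3)))


def compose(p, q):
    """(p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(3))


def fixed_points(p):
    return sum(1 for i in range(3) if p[i] == i)


def sign(p):
    # Parity from the number of inversions.
    inv = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inv % 2 else 1


# Irreducible characters of S_3 as functions on permutations.
IRREPS = {
    "trivial": lambda p: 1,
    "sign": sign,
    "standard": lambda p: fixed_points(p) - 1,
}

# Elements ((alpha, beta), t) of the wreath product S_3 wr Z_2.
WREATH = [((a, b), t) for a, b in product(S3, S3) for t in (0, 1)]


def inhomogeneous(lam, mu):
    """Character of sigma_{lam, mu} for lam != mu: zero on flips."""
    cl, cm = IRREPS[lam], IRREPS[mu]
    return lambda g: (cl(g[0][0]) * cm(g[0][1]) + cm(g[0][0]) * cl(g[0][1])) if g[1] == 0 else 0


def homogeneous(lam, s):
    """Character of sigma^s_{lam, lam}, with s = +1 or -1."""
    cl = IRREPS[lam]
    return lambda g: cl(g[0][0]) * cl(g[0][1]) if g[1] == 0 else s * cl(compose(g[0][0], g[0][1]))


def inner(chi, psi):
    """Inner product of real-valued class functions on the wreath product."""
    return Fraction(sum(chi(g) * psi(g) for g in WREATH), len(WREATH))


names = list(IRREPS)
chars = [inhomogeneous(names[i], names[j]) for i in range(3) for j in range(i + 1, 3)]
chars += [homogeneous(lam, s) for lam in names for s in (1, -1)]

gram = [[inner(x, y) for y in chars] for x in chars]
identity = tuple(range(3))
dims = [chi(((identity, identity), 0)) for chi in chars]
print(dims, sum(d * d for d in dims))
```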

Suppose that the irrep labels $\sigma = \{\sigma_i\}$ observed during a run of the sieve algorithm consist entirely of inhomogeneous irreps of $G \wr \mathbb{Z}_2$. Since these irreps have zero character at any flip, the only trivial or legal assignments $\{a_i\}$ that contribute to the sums (7) and (9) are those where each $a_i$ is a non-flip, i.e., is contained in the subgroup $K \cong G \times G$. But the product of any string of such elements is also contained in $K$, so if this product is in $H = \{1, m\}$ where $m \notin K$, it is equal to 1. Thus any legal assignment of this kind is trivial, the sums (7) and (9) coincide, and the probability of observing $\sigma$ is the same in the trivial and nontrivial cases. That is, so long as every $\sigma_i$ in $\sigma$ is inhomogeneous,

$$P_T^H[\sigma] = P_T^{\{1\}}[\sigma]. \tag{12}$$

Our strategy will be to show that observing even a single homogeneous irrep is unlikely, unless the tree generated by the sieve algorithm is quite large. Moreover, because the two distributions coincide unless this occurs, it suffices to show that this is unlikely in the case where $H$ is trivial. Now, it is easy to see that the probability of observing a given representation in $G \wr \mathbb{Z}_2$, under either the Plancherel distribution or a natural distribution, factorizes neatly into the probabilities that we observe the corresponding pair of irreps, in either order, in a pair of similar experiments in $G$. First, the Plancherel measure of an inhomogeneous irrep $\sigma_{\{\lambda,\mu\}}$ is

$$\mathcal{P}^{G \wr \mathbb{Z}_2}_{\mathrm{planch}}(\sigma_{\{\lambda,\mu\}}) = \frac{(2 d_\lambda d_\mu)^2}{2|G|^2} = 2\, \mathcal{P}^{G}_{\mathrm{planch}}(\lambda)\, \mathcal{P}^{G}_{\mathrm{planch}}(\mu). \tag{13}$$

Similarly, the probability that we observe a homogeneous irrep $\sigma^\pm_{\{\lambda,\lambda\}}$ is the probability of observing $\lambda$ twice under the Plancherel distribution in $G$, in which case the sign $\pm$ is chosen uniformly:

$$\mathcal{P}^{G \wr \mathbb{Z}_2}_{\mathrm{planch}}(\sigma^\pm_{\{\lambda,\lambda\}}) = \frac{d_\lambda^4}{2|G|^2} = \frac{1}{2}\, \mathcal{P}^{G}_{\mathrm{planch}}(\lambda)^2. \tag{14}$$
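The following sketch illustrates, in the smallest possible case, why only homogeneous irreps carry information about the hidden subgroup. It assumes $G = \mathbb{Z}_2$ and a tree consisting of a single leaf ($k = \ell = 1$), so that the sums over trivial and legal assignments reduce to $d_\sigma^2/|G \wr \mathbb{Z}_2|$ and $d_\sigma(\chi_\sigma(1) + \chi_\sigma(m))/|G \wr \mathbb{Z}_2|$ respectively. The labels `even` and `odd` for the two irreps of $\mathbb{Z}_2$ and all function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Irreducible characters of G = Z_2 (elements 0 and 1, written additively).
CHI = {"even": lambda a: 1, "odd": lambda a: -1 if a else 1}

# Wreath product Z_2 wr Z_2: elements ((a, b), t), a group of order 8.
GROUP = [((a, b), t) for a, b, t in product((0, 1), repeat=3)]
ORDER = len(GROUP)


def inhomogeneous(g):
    """Character of the single inhomogeneous irrep sigma_{even, odd}."""
    (a, b), t = g
    return 0 if t else CHI["even"](a) * CHI["odd"](b) + CHI["odd"](a) * CHI["even"](b)


def homogeneous(name, s):
    """Character of the homogeneous irrep sigma^s_{name, name}, s = +1 or -1."""
    chi = CHI[name]

    def character(g):
        (a, b), t = g
        return s * chi((a + b) % 2) if t else chi(a) * chi(b)

    return character


IRREPS = {"{even,odd}": inhomogeneous}
for name in CHI:
    IRREPS["{%s,%s}+" % (name, name)] = homogeneous(name, 1)
    IRREPS["{%s,%s}-" % (name, name)] = homogeneous(name, -1)

identity = ((0, 0), 0)
m = ((1, 1), 1)  # a flip of the form ((alpha, alpha^-1), 1)


def p_trivial(chi):
    """Plancherel probability d^2 / |group| (hidden subgroup trivial)."""
    d = chi(identity)
    return Fraction(d * d, ORDER)


def p_hidden(chi):
    """Single-leaf probability when the hidden subgroup is {1, m}."""
    d = chi(identity)
    return Fraction(d * (chi(identity) + chi(m)), ORDER)


for label, chi in IRREPS.items():
    print(label, p_trivial(chi), p_hidden(chi))
```

In this toy case the inhomogeneous irrep has probability 1/2 under both hypotheses, while the homogeneous irreps have probability 1/8 each when $H$ is trivial and either 1/4 or 0 when $H = \{1, m\}$.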

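Now consider the natural distribution in the tensor product of two inhomogeneous irreps $\sigma_{\{\lambda,\lambda'\}}$ and $\sigma_{\{\mu,\mu'\}}$. The multiplicity of a given homogeneous irrep $\sigma^\pm_{\{\tau,\tau\}}$ in this tensor product, equal to

$$\bigl\langle \chi^\pm_{\{\tau,\tau\}},\ \chi_{\{\lambda,\lambda'\}} \chi_{\{\mu,\mu'\}} \bigr\rangle_{G \wr \mathbb{Z}_2},$$

factorizes as follows (only non-flips contribute, since the inhomogeneous characters vanish at flips):

$$\langle \chi_\tau, \chi_\lambda \chi_\mu \rangle_G \langle \chi_\tau, \chi_{\lambda'} \chi_{\mu'} \rangle_G + \langle \chi_\tau, \chi_\lambda \chi_{\mu'} \rangle_G \langle \chi_\tau, \chi_{\lambda'} \chi_\mu \rangle_G.$$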

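Thus the probability of observing either $\sigma^+_{\{\tau,\tau\}}$ or $\sigma^-_{\{\tau,\tau\}}$ under the natural distribution is

$$\mathcal{P}_{\sigma_{\{\lambda,\lambda'\}} \otimes \sigma_{\{\mu,\mu'\}}}(\sigma_{\{\tau,\tau\}}) = \frac{1}{2} \Bigl( \mathcal{P}_{\lambda \otimes \mu}(\tau)\, \mathcal{P}_{\lambda' \otimes \mu'}(\tau) + \mathcal{P}_{\lambda \otimes \mu'}(\tau)\, \mathcal{P}_{\lambda' \otimes \mu}(\tau) \Bigr). \tag{15}$$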

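In other words, the probability of observing a homogeneous irrep of $G \wr \mathbb{Z}_2$ is the probability of observing the same irrep in two natural distributions on $G$. Let us denote the probability that we observe the same irrep in the natural distributions in $\lambda \otimes \mu$ and $\lambda' \otimes \mu'$ (that is, that these two distributions collide) as

$$\mathcal{P}^{\mathrm{coll}}_{\lambda \otimes \mu,\, \lambda' \otimes \mu'} = \sum_{\tau \in \widehat{G}} \mathcal{P}_{\lambda \otimes \mu}(\tau)\, \mathcal{P}_{\lambda' \otimes \mu'}(\tau).$$

Then (15) implies that the total probability of observing a homogeneous irrep is

$$\frac{1}{2} \Bigl( \mathcal{P}^{\mathrm{coll}}_{\lambda \otimes \mu,\, \lambda' \otimes \mu'} + \mathcal{P}^{\mathrm{coll}}_{\lambda \otimes \mu',\, \lambda' \otimes \mu} \Bigr) \le \max\Bigl( \mathcal{P}^{\mathrm{coll}}_{\lambda \otimes \mu,\, \lambda' \otimes \mu'},\ \mathcal{P}^{\mathrm{coll}}_{\lambda \otimes \mu',\, \lambda' \otimes \mu} \Bigr). \tag{16}$$

In the next section, we show that if $\lambda$, $\mu$, $\lambda'$ and $\mu'$ are typical irreps of $S_n$, then no irrep $\tau$ occurs too often in any of these natural distributions, and so the probability of a collision is small.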


6.  Collisions,  smoothness,  and  characters 

Let us bound the probability $\mathcal{P}^{\mathrm{coll}} = \mathcal{P}^{\mathrm{coll}}_{\lambda \otimes \mu,\, \lambda' \otimes \mu'}$ that the natural distributions in $\lambda \otimes \mu$ and $\lambda' \otimes \mu'$ collide. The idea is that $\mathcal{P}^{\mathrm{coll}}$ is small as long as either or both of these distributions are smooth, in the sense that they are spread fairly uniformly across many $\tau$. The following lemmas show that this notion of smoothness can be related to bounds on the normalized characters of these representations. First, we present a lemma which relates the natural distribution in a representation $\rho$ to the Plancherel distribution.

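Lemma 1. Let $\rho$ be a (possibly reducible) representation of a group $G$, and let $\mathcal{P}_\rho(\tau)$ denote the probability of observing an irrep $\tau \in \widehat{G}$ under the natural distribution in $\rho$. Let $X \subseteq \widehat{G}$, and let $\mathcal{P}_\rho(X) = \sum_{\tau \in X} \mathcal{P}_\rho(\tau)$ and $\mathcal{P}_{\mathrm{planch}}(X) = \sum_{\tau \in X} d_\tau^2/|G|$ denote the total probability of observing an irrep in $X$ in the natural and Plancherel distributions respectively. Then

$$\mathcal{P}_\rho(X) \le \sqrt{\mathcal{P}_{\mathrm{planch}}(X)}\ \sqrt{\sum_{g \in G} \left| \frac{\chi_\rho(g)}{d_\rho} \right|^2}.$$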

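Proof. In general, we have

$$\mathcal{P}_\rho(\tau) = \frac{d_\tau}{d_\rho} \langle \chi_\tau, \chi_\rho \rangle_G.$$

Therefore, if we define

$$\mathbf{1}_X = \sum_{\tau \in X} d_\tau \chi_\tau,$$

then by Cauchy-Schwarz we have

$$\mathcal{P}_\rho(X) = \frac{1}{d_\rho} \langle \mathbf{1}_X, \chi_\rho \rangle_G \le \frac{1}{d_\rho} \sqrt{\langle \mathbf{1}_X, \mathbf{1}_X \rangle_G\, \langle \chi_\rho, \chi_\rho \rangle_G}$$

and by Schur’s lemma we have

$$\frac{1}{|G|} \langle \mathbf{1}_X, \mathbf{1}_X \rangle_G = \sum_{\tau \in X} \frac{d_\tau^2}{|G|} \langle \chi_\tau, \chi_\tau \rangle_G = \sum_{\tau \in X} \frac{d_\tau^2}{|G|} = \mathcal{P}_{\mathrm{planch}}(X).$$

Since $\langle \chi_\rho, \chi_\rho \rangle_G = (1/|G|) \sum_{g \in G} |\chi_\rho(g)|^2$, combining these two facts gives the stated bound, which completes the proof. □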
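As a numerical sanity check of Lemma 1, the sketch below takes $G = S_4$ (an illustrative choice, with the standard character table hard-coded by conjugacy class and irreps labelled by their partitions) and verifies the bound for every nonempty subset $X$ of irreps. A small floating-point tolerance is used because the bound is attained with equality in some cases. The function names are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Character table of S_4 by conjugacy class:
# e, (12), (12)(34), (123), (1234), with class sizes 1, 6, 3, 8, 6.
SIZES = [1, 6, 3, 8, 6]
TABLE = {
    "4": [1, 1, 1, 1, 1],
    "1111": [1, -1, 1, 1, -1],
    "22": [2, 0, 2, -1, 0],
    "31": [3, 1, -1, 0, -1],
    "211": [3, -1, -1, 0, 1],
}
ORDER = sum(SIZES)  # |S_4| = 24


def inner(chi, psi):
    """<chi, psi>_G for real-valued class functions."""
    return sum(s * a * b for s, a, b in zip(SIZES, chi, psi)) / ORDER


def natural(chi_rho):
    """Natural distribution P_rho(tau) = d_tau / d_rho * <chi_tau, chi_rho>."""
    return {t: c[0] / chi_rho[0] * inner(c, chi_rho) for t, c in TABLE.items()}


def lemma1_holds(chi_rho, tol=1e-12):
    """Check P_rho(X) <= sqrt(P_planch(X)) * sqrt(sum_g |chi_rho(g)/d_rho|^2) for all X."""
    dist = natural(chi_rho)
    spread = sqrt(sum(s * (x / chi_rho[0]) ** 2 for s, x in zip(SIZES, chi_rho)))
    for r in range(1, len(TABLE) + 1):
        for subset in combinations(TABLE, r):
            lhs = sum(dist[t] for t in subset)
            planch = sum(TABLE[t][0] ** 2 for t in subset) / ORDER
            if lhs > sqrt(planch) * spread + tol:
                return False
    return True


rho = [a * b for a, b in zip(TABLE["31"], TABLE["22"])]  # character of (3,1) tensor (2,2)
print(natural(rho), lemma1_holds(rho))
```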


Now  we  bound  the  probability  of  a  collision  as  follows. 


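Lemma 2. Given a family of groups $\{G_n\}$, say that an irrep $\lambda$ of $G_n$ is $f(n)$-smooth if

$$\sum_{g \in G_n} \left| \frac{\chi_\lambda(g)}{d_\lambda} \right|^4 \le f(n).$$

Suppose that $\lambda$ and $\mu$ are $f(n)$-smooth. Then

$$\mathcal{P}^{\mathrm{coll}} \le \frac{\max_\tau d_\tau}{\sqrt{|G_n|}} \sqrt{f(n)}.$$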


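Proof. We write $G$ for $G_n$ to conserve ink. We have $\mathcal{P}^{\mathrm{coll}} \le \max_\tau \mathcal{P}_{\lambda \otimes \mu}(\tau)$. Setting $\rho = \lambda \otimes \mu$ and $X = \{\tau\}$ in Lemma 1 and applying Cauchy-Schwarz gives

$$\begin{aligned}
\mathcal{P}^{\mathrm{coll}} &\le \sqrt{\max_\tau \mathcal{P}_{\mathrm{planch}}(\tau)}\ \sqrt{\sum_{g \in G} \left| \frac{\chi_\lambda(g)}{d_\lambda} \right|^2 \left| \frac{\chi_\mu(g)}{d_\mu} \right|^2} \\
&\le \frac{\max_\tau d_\tau}{\sqrt{|G|}}\ \sqrt[4]{\sum_{g \in G} \left| \frac{\chi_\lambda(g)}{d_\lambda} \right|^4}\ \sqrt[4]{\sum_{g \in G} \left| \frac{\chi_\mu(g)}{d_\mu} \right|^4},
\end{aligned}$$

which completes the proof. □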


Now let us focus on the case relevant to Graph Isomorphism, where $G = S_n$. Here we recall that each irrep of the symmetric group $S_n$ corresponds to a Young diagram, or equivalently an integer partition $\lambda_1 \ge \lambda_2 \ge \cdots$ where $\sum_i \lambda_i = n$. The maximum dimension of any irrep is bounded by the following result of Vershik and Kerov:

Theorem 3 ([VK85]). There is a constant $\hat{c} > 0$ such that $\max_\tau d_\tau \le e^{-(\hat{c}/2)\sqrt{n}} \sqrt{n!}$.

In this case, Lemma 2 gives

$$\mathcal{P}^{\mathrm{coll}} \le e^{-(\hat{c}/2)\sqrt{n}} \sqrt{f(n)}. \tag{17}$$

Therefore, our goal is to show that typical irreps of $S_n$ are $f(n)$-smooth where $f(n)$ grows slowly enough with $n$, and to show inductively that with high probability all the irreps we observe throughout the sieve are typical. We do this by defining a typical irrep as follows.

Definition 4. Let $D > e$ be a fixed constant, and say that an irrep $\lambda$ of $S_n$ is typical if the following two conditions hold true:

• the height and width of its Young diagram are less than $D\sqrt{n}$ or, in other words, if the Young diagram is $D$-balanced [Bia98],

• the dimension $d_\lambda$ fulfills

$$d_\lambda \ge e^{-\sqrt{n} \log n} \sqrt{n!}.$$
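To get a feel for Definition 4, the following sketch enumerates all Young diagrams with $n = 16$ boxes, computes each dimension with the hook length formula (a standard fact not stated in the text), and sums the Plancherel weight $d_\lambda^2/n!$ of the typical diagrams. The value $D = 3$ and the function names are assumptions made for illustration only.

```python
from math import factorial, log, sqrt


def partitions(n, largest=None):
    """Yield the integer partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def dimension(shape):
    """Dimension of the irrep of S_n labelled by `shape`, via the hook length formula."""
    n = sum(shape)
    cols = [sum(1 for row in shape if row > j) for j in range(shape[0])] if shape else []
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1       # cells to the right in the same row
            leg = cols[j] - i - 1   # cells below in the same column
            hooks *= arm + leg + 1
    return factorial(n) // hooks


def is_typical(shape, D=3.0):
    """Definition 4 with a hypothetical choice D = 3 > e."""
    n = sum(shape)
    balanced = shape[0] < D * sqrt(n) and len(shape) < D * sqrt(n)
    # Compare logarithms: log d >= -sqrt(n) log n + (1/2) log n!
    big_enough = log(dimension(shape)) >= -sqrt(n) * log(n) + 0.5 * log(factorial(n))
    return balanced and big_enough


n = 16
shapes = list(partitions(n))
total = sum(dimension(s) ** 2 for s in shapes)
typical_mass = sum(dimension(s) ** 2 for s in shapes if is_typical(s)) / factorial(n)
print(len(shapes), total == factorial(n), typical_mass)
```

Even at this small size almost all of the Plancherel weight sits on typical diagrams; the atypical ones are the very wide, very tall, or very low-dimensional shapes.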


To  motivate  this  definition,  and  to  provide  the  base  case  for  our  induction, 
we  show  the  following. 

Lemma 5. There are constants $c > 0$ and $n_0$ such that, if $\lambda$ has $n$ boxes with $n > n_0$ and $\lambda$ is chosen according to the Plancherel distribution, then $\lambda$ is typical with probability at least $1 - e^{-c\sqrt{n}}$.


Proof. Firstly, we bound the probability that $\lambda$ is not $D$-balanced. The Robinson-Schensted correspondence [Ful97] maps permutations to Young diagrams in such a way that the uniform measure on $S_n$ maps to the Plancherel measure. In addition, the width (resp. height) of the Young diagram is equal to the length of the longest increasing (resp. decreasing) subsequence. Therefore, the probability in the Plancherel measure that an irrep is not $D$-balanced is at most twice the probability that a random permutation has an increasing subsequence of length $w = D\sqrt{n}$.

The problem of determining the typical size of the longest increasing subsequence is known as Ulam’s problem; it can be solved using representation theory [Ker03] or by a beautiful hydrodynamic argument [AD95], and indeed this Lemma holds even if we take $D > 2$ in Definition 4. Here we content ourselves with an elementary bound for $D > e$. By Markov’s inequality, the probability of an increasing subsequence of length $w = D\sqrt{n}$ is at most the expected number of such subsequences, which is

$$\binom{n}{w} \frac{1}{w!} \le \frac{n^w}{(w!)^2} < \left( \frac{e\sqrt{n}}{w} \right)^{2w} = \left( \frac{e}{D} \right)^{2D\sqrt{n}} \tag{18}$$

where we used Stirling’s approximation $w! > w^w e^{-w}$.

Secondly, we shall bound the probability that

$$d_\lambda < e^{-\sqrt{n} \log n} \sqrt{n!}. \tag{19}$$


The number of irreps is the partition number

$$p(n) = (1 + o(1)) \frac{1}{4\sqrt{3} \cdot n}\, e^{\delta\sqrt{n}} < e^{\delta\sqrt{n}}$$

where $\delta = \sqrt{2/3}\, \pi$; therefore the Plancherel measure of the set of irreps $\lambda$ of $S_n$ for which (19) holds true is at most the number of irreps times the measure of a single such $\lambda$, so this probability is at most

$$e^{\delta\sqrt{n}}\, e^{-2\sqrt{n} \log n} = e^{-\omega(\sqrt{n})}. \tag{20}$$

The sum of the probabilities (18) and (20) is bounded from above by $e^{-c\sqrt{n}}$ for sufficiently small $c > 0$ and for $n$ sufficiently large. □


Given a permutation $\pi$, let $t(\pi)$ denote the length of the shortest sequence of transpositions whose product is $\pi$; for instance, if $\pi$ is a single $k$-cycle, then $t(\pi) = k - 1$.


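Lemma 6. There is a constant $A$ such that, for $n$ sufficiently large, the normalized character of all typical $\lambda$ obeys

$$\left| \frac{\chi_\lambda(\pi)}{d_\lambda} \right| \le \left( \frac{A}{\sqrt{n}} \right)^{t(\pi)}$$

for all $\pi \in S_n$ with $t(\pi) \ge \sqrt{n} \log n$.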


Proof. We use the Murnaghan-Nakayama formula for the character [JK81]. A ribbon tile of length $k$ is a polyomino of $k$ cells, arranged in a path where each step is up or to the right. Given a Young diagram $\lambda$ and a permutation $\pi$ with cycle structure $k_1 \ge k_2 \ge \cdots$, a consistent tiling consists of removing a ribbon tile of length $k_1$ from the boundary of $\lambda$, then one of length $k_2$, and so on, with the requirement that the remaining part of $\lambda$ is a Young diagram at each step. Let $h_i$ denote the height of the ribbon tile corresponding to the $i$th cycle; then the Murnaghan-Nakayama formula states that

$$\chi_\lambda(\pi) = \sum_T \prod_i (-1)^{h_i + 1} \tag{21}$$

where the sum is over all consistent tilings $T$.
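The Murnaghan-Nakayama rule is easy to implement recursively. The sketch below is one possible encoding, not the text's own formulation: it represents a Young diagram by its beta numbers (first-column hook lengths), so that removing a ribbon tile of length $k$ amounts to replacing one beta number $b$ by $b - k$, and the sign is determined by the number of beta numbers jumped over (the number of rows spanned by the tile, minus one). The function names are hypothetical.

```python
def character(shape, cycle_type):
    """Character of the S_n irrep `shape` on permutations with the given cycle type.

    Uses the Murnaghan-Nakayama rule in terms of beta numbers
    (first-column hook lengths): removing a ribbon tile of length k
    replaces one beta number b by b - k.
    """
    if sum(shape) != sum(cycle_type):
        raise ValueError("shape and cycle type must have the same size")
    rows = len(shape)
    beta = frozenset(part + rows - 1 - i for i, part in enumerate(shape))
    return _recurse(beta, tuple(sorted(cycle_type, reverse=True)))


def _recurse(beta, cycles):
    if not cycles:
        return 1  # all boxes removed: one consistent tiling completed
    k, rest = cycles[0], cycles[1:]
    total = 0
    for b in beta:
        new = b - k
        if new < 0 or new in beta:
            continue  # no ribbon tile of length k can be removed here
        # The tile spans (crossed + 1) rows, where crossed counts the beta
        # numbers strictly between the new and old values; its sign is (-1)^crossed.
        crossed = sum(1 for c in beta if new < c < b)
        sign = -1 if crossed % 2 else 1
        total += sign * _recurse((beta - {b}) | {new}, rest)
    return total


# Rows of the character table of S_4, columns ordered by cycle type.
types = [(1, 1, 1, 1), (2, 1, 1), (2, 2), (3, 1), (4,)]
for shape in [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]:
    print(shape, [character(shape, t) for t in types])
```

Every consistent tiling contributes $\pm 1$ to the sum, so the number of recursive paths that reach the base case is an upper bound on the absolute value of the character.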

Figure 1. We associate each possible location for a ribbon tile of fixed length $k$ with a cell (shaded) which is above the tile’s lower end and to the left of its upper end. The resulting sequence of cells moves up and to the right at each step, implying that the number of locations is less than $\sqrt{2n}$. Here $k = 3$.

Clearly the number of consistent tilings is an upper bound on $|\chi_\lambda(\pi)|$. Now, we claim that for any fixed $k$, the number of possible locations for a ribbon tile of length $k$ on the boundary of a Young diagram $\lambda$ of size $n$ is less than $\sqrt{2n}$. To see this, associate each one with the cell of $\lambda$ which is directly above the tile’s lower end, and directly to the left of its upper end, as shown in Figure 1. A little thought reveals that the resulting sequence of cells has the property that each one is above and to the right of the previous one. Therefore, if there are $\ell$ locations, we have

$$n \ge \sum_{i=1}^{\ell} i > \ell^2/2,$$

and so $\ell < \sqrt{2n}$. It follows that the number of ways to remove the ribbon tiles corresponding to the $c(\pi)$ nontrivial cycles is less than

$$(2n)^{c(\pi)/2}.$$

Moreover, after these ribbon tiles are removed, the number of consistent tilings of the remaining Young diagram is simply the dimension of the corresponding irrep of $S_{n-s(\pi)}$, where $s(\pi)$ denotes the number of points moved by $\pi$; this dimension is less than $\sqrt{|S_{n-s(\pi)}|} = \sqrt{(n - s(\pi))!}$. Therefore, if $\lambda$ is typical we have

$$\begin{aligned}
\left| \frac{\chi_\lambda(\pi)}{d_\lambda} \right| &\le \frac{(2n)^{c(\pi)/2} \sqrt{(n - s(\pi))!}}{e^{-\sqrt{n} \log n} \sqrt{n!}} \\
&\le 2 \cdot e^{\sqrt{n} \log n}\, 2^{c(\pi)/2}\, e^{s(\pi)/2}\, n^{(c(\pi) - s(\pi))/2} \\
&\le 2 \cdot e^{\sqrt{n} \log n}\, (\sqrt{2}\, e)^{t(\pi)}\, n^{-t(\pi)/2}.
\end{aligned}$$

Here we used the bound $(n - s)!/n! < 4 \cdot n^{-s} e^{s}$, implied by Stirling’s approximation, and the facts that $c(\pi) \le t(\pi)$, $s(\pi) \le 2t(\pi)$, and $t(\pi) = s(\pi) - c(\pi)$. Finally, if $t(\pi) \ge \sqrt{n} \log n$, the term $e^{\sqrt{n} \log n}$ can be absorbed into $A^{t(\pi)}$, and the Lemma holds for any $A > \sqrt{2}\, e^2$. □


Lemma 7 (Rattan and Śniady [RS06]). For every $D > 0$ there exists a constant $A'$ with the following property. If $\lambda$ is a Young diagram with $n$ boxes which has at most $D\sqrt{n}$ rows and columns and $\pi \in S_n$ is a permutation then

$$\left| \frac{\chi_\lambda(\pi)}{d_\lambda} \right| \le \left( \frac{A' \max(1, t(\pi)^2/n)}{\sqrt{n}} \right)^{t(\pi)}. \tag{22}$$


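Lemma 8. All typical irreps $\lambda$ are $O(1)$-smooth.

Proof. If $\lambda$ is typical, then Lemma 6 implies

$$\sum_{\substack{\pi \in S_n \\ t(\pi) \ge \sqrt{n} \log n}} \left| \frac{\chi_\lambda(\pi)}{d_\lambda} \right|^4 \le \sum_{\pi \in S_n} \left( A^{t(\pi)} n^{-t(\pi)/2} \right)^4 = \sum_{\pi \in S_n} z^{t(\pi)}$$

for $z = A^4/n^2$. Since each $\pi \in S_n$ appears exactly once in the product

$$[1 + (12)]\, [1 + (13) + (23)] \cdots [1 + (1n) + \cdots + (n-1, n)]$$

where $(i,j)$ denotes the transposition interchanging $i$ and $j$, and since each product of the summands provides a factorization of $\pi$ into a minimal number of transpositions, we have

$$\sum_{\pi \in S_n} z^{t(\pi)} = (1 + z)(1 + 2z) \cdots (1 + (n-1)z) \le e^{z} e^{2z} \cdots e^{(n-1)z} \le e^{zn^2/2} = e^{A^4/2};$$

therefore

$$\sum_{\substack{\pi \in S_n \\ t(\pi) \ge \sqrt{n} \log n}} \left| \frac{\chi_\lambda(\pi)}{d_\lambda} \right|^4 \le e^{A^4/2}. \tag{23}$$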


Very similar but slightly more involved reasoning can be applied to the estimate from Lemma 7 (for details we refer to [RS06]), which shows that there exist constants $E > 0$ and $E'$ (which depend only on $D$) with the property that if a Young diagram $\lambda$ with $n$ boxes has at most $D\sqrt{n}$ boxes in each row and column then

$$\sum_{\substack{\pi \in S_n \\ t(\pi) \le E n^{4/7}}} \left| \frac{\chi_\lambda(\pi)}{d_\lambda} \right|^4 \le E'. \tag{24}$$

The domains of the summations in the inequalities (23) and (24) cover the whole group $S_n$ for sufficiently large $n$, which finishes the proof. □
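The generating-function identity used in the proof of Lemma 8, $\sum_{\pi \in S_n} z^{t(\pi)} = (1 + z)(1 + 2z) \cdots (1 + (n-1)z)$, can be checked by brute force for small $n$. The sketch below uses the standard fact that $t(\pi)$ equals $n$ minus the number of cycles of $\pi$ (counting fixed points); the function names and the test value of $z$ are arbitrary illustrative choices.

```python
from fractions import Fraction
from itertools import permutations


def t(perm):
    """Minimal number of transpositions whose product is perm: n minus the number of cycles."""
    n, seen, cycles = len(perm), set(), 0
    for start in range(n):
        if start in seen:
            continue
        cycles += 1
        j = start
        while j not in seen:
            seen.add(j)
            j = perm[j]
    return n - cycles


def lhs(n, z):
    """Sum of z^t(pi) over all pi in S_n, by brute force."""
    return sum(z ** t(p) for p in permutations(range(n)))


def rhs(n, z):
    """(1 + z)(1 + 2z)...(1 + (n-1)z)."""
    out = Fraction(1)
    for j in range(1, n):
        out *= 1 + j * z
    return out


z = Fraction(1, 3)  # arbitrary test value
for n in range(1, 7):
    print(n, lhs(n, z), rhs(n, z))
```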


Lemma 9. There are constants $c' > 0$ and $n_0$ such that for all pairs of typical irreps $\lambda$ and $\mu$, if $\tau$ is chosen according to the natural distribution in $\lambda \otimes \mu$ then $\tau$ is typical with probability at least $1 - e^{-c'\sqrt{n}}$ if $n > n_0$.

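Proof. Let $X$ be the set of atypical representations, and let $\rho = \lambda \otimes \mu$. Then applying Lemma 1 and Lemma 5, using Cauchy-Schwarz as in the proof of Lemma 2, and finally applying Lemma 8 gives

$$\begin{aligned}
\mathcal{P}_{\lambda \otimes \mu}(X) &\le \sqrt{\mathcal{P}_{\mathrm{planch}}(X)}\ \sqrt{\sum_{g \in G} \left| \frac{\chi_\lambda(g)}{d_\lambda} \right|^2 \left| \frac{\chi_\mu(g)}{d_\mu} \right|^2} \\
&\le e^{-(c/2)\sqrt{n}}\ \sqrt[4]{\sum_{g \in G} \left| \frac{\chi_\lambda(g)}{d_\lambda} \right|^4}\ \sqrt[4]{\sum_{g \in G} \left| \frac{\chi_\mu(g)}{d_\mu} \right|^4} \\
&\le e^{-(c/2)\sqrt{n}}\, O(1),
\end{aligned}$$

which completes the proof for any $c' < c/2$. □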


7.  Proof  of  the  main  result 

We  are  now  in  a  position  to  present  our  main  result. 

Theorem 10. Let $\hat{c}$, $c$, $c'$ be the constants defined above (in Theorem 3, Lemma 5, and Lemma 9, respectively). Then for any constants $a, b$ such that $a + b < \min(\hat{c}/2, c, c')$, no sieve algorithm which combines less than $e^{a\sqrt{n}}$ coset states can solve Graph Isomorphism with success probability greater than $e^{-b\sqrt{n}}$.

Proof. We first consider the behavior of a sieve algorithm $\mathcal{A}$ in the case where the hidden subgroup $H \subset S_n \wr \mathbb{Z}_2$ is trivial. For convenience, let us say that a representation $\sigma_{\{\lambda,\mu\}}$ of $S_n \wr \mathbb{Z}_2$ is typical if both $\lambda$ and $\mu$ are. We will establish that with overwhelming probability, all the irrep labels observed by $\mathcal{A}$ are both typical and inhomogeneous.

Let $\ell$ be the number of coset states initially generated by the algorithm. We begin by showing that with high probability, the irrep labels on the $\ell$ leaves, i.e., those resulting from weak Fourier sampling these coset states, are all both typical and inhomogeneous. If $H$ is trivial, then these irrep labels are Plancherel-distributed; by (13) the probability that a given one fails to be typical is at most twice the probability that a Plancherel-distributed irrep of $S_n$ fails to be, which by Lemma 5 is at most $e^{-c\sqrt{n}}$. Moreover, by (14) the probability that the label of a given leaf is homogeneous is the probability




CRISTOPHER  MOORE,  ALEXANDER  RUSSELL,  AND  PIOTR  SNIADY 


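that we observe the same irrep of $S_n$ twice in two independent samples of the Plancherel distribution, which using Theorem 3 (with constant $\hat{c}$) is

$$\sum_{\lambda} \mathcal{P}_{\mathrm{planch}}(\lambda)^2 \le \max_\lambda \frac{d_\lambda^2}{n!} \le e^{-\hat{c}\sqrt{n}}.$$

Thus the combined probability that any of the $\ell$ leaves have a label which is not both typical and inhomogeneous is at most

$$\ell \left( 2e^{-c\sqrt{n}} + e^{-\hat{c}\sqrt{n}} \right). \tag{25}$$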

Now, assume inductively that all the irreps observed by the algorithm before the $i$th combine-and-measure step are typical and inhomogeneous, and that the $i$th step combines states with two such labels $\sigma_{\{\lambda,\lambda'\}}$ and $\sigma_{\{\mu,\mu'\}}$. By (16), the probability this results in a homogeneous irrep is bounded by the probability $\mathcal{P}^{\mathrm{coll}}$ of a collision between a pair of natural distributions in $S_n$. Then Theorem 3 (with constant $\hat{c}$) and Lemmas 2 and 8 imply that this probability is bounded by

$$\mathcal{P}^{\mathrm{coll}} \le e^{-(\hat{c}/2)\sqrt{n}}\, O(1).$$

In addition, Lemma 9 implies that the probability the observed irrep fails to be typical is at most $e^{-c'\sqrt{n}}$. Since each combine-and-measure step reduces the number of states by one, there are less than $\ell$ such steps; taking a union bound over all of them, the probability that any of the observed irreps fail to be both inhomogeneous and typical is at most

$$\ell \left( e^{-(\hat{c}/2)\sqrt{n}}\, O(1) + e^{-c'\sqrt{n}} \right). \tag{26}$$

Let us call a transcript inhomogeneous if all of its irrep labels are. Combining (25) and (26) and setting $\ell \le e^{a\sqrt{n}}$, we see that, for $n$ sufficiently large, $\mathcal{A}$’s transcript is inhomogeneous with probability greater than $1 - e^{-b\sqrt{n}}$ for any $b < \min(\hat{c}/2, c, c') - a$.

Now consider $\mathcal{A}$’s behavior in the case of a nontrivial hidden subgroup $H = \{1, m\}$. Inductively applying Equation (12) shows that the probability of observing any inhomogeneous transcript is exactly the same as it would have been if $H$ were trivial. Thus the total variation distance between the distribution of transcripts generated by $\mathcal{A}$ in these two cases is less than $e^{-b\sqrt{n}}$, and the theorem is proved. □